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Abstract 

This  research  focuses  on  numerically  solving  a  class  of  computationally  expen¬ 
sive  optimization  problems  that  possesses  a  unique  characteristic:  as  the  optimal 
solution  is  approached,  the  computational  time  required  to  compute  an  objective 
function  value  decreases.  This  is  motivated  by  an  application  in  which  each  objec¬ 
tive  function  evaluation  requires  both  a  numerical  fluid  dynamics  simulation  and  an 
image  registration  and  comparison  process.  The  goal  is  to  find  the  parameters  of  a 
predetermined  image  by  comparing  the  flow  dynamics  from  the  numerical  simulation 
and  the  predetermined  image  through  the  image  comparison  process.  The  general¬ 
ized  pattern  search  and  mesh  adaptive  direct  search  methods  were  applied  in  a  way 
that  employs  surrogate  functions  in  the  search  step  to  reduce  the  number  of  costly 
function  evaluations.  The  surrogate  functions  are  formed,  based  on  either  previous 
function  values  or  their  computational  times,  or  both.  The  solution  to  the  surro¬ 
gate  optimization  problem  can  be  solved  easily  and  provides  an  improved  solution 
quickly.  A  time  cut-off  parameter  was  also  added  to  the  objective  function  to  allow 
its  termination  during  the  comparison  process  if  the  computational  time  exceeds  a 
specified  threshold.  The  approach  was  tested  on  two  problems  using  the  NOMADrn 
and  DACE  MATLAB®  software  packages,  and  results  are  presented. 
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SURROGATE  STRATEGIES  FOR  COMPUTATIONALLY 
EXPENSIVE  OPTIMIZATION  PROBLEMS  WITH  CPU-TIME 
CORRELATED  FUNCTIONS 


1.  Introduction 

The  optimization  problem  considered  in  this  research  is  to  evaluate, 

min  f(x),  (1.1) 

where  /  :  Mn  — >  M  U  {±00}  is  computationally-expensive,  Q  =  {x  E  W1  :  l  <  x  <  u} 
and  l,  u  E  (MU{±oo})n  for  l  <  u.  The  objective  function  /  will  be  treated  as  a  “black 
box”  where  /  can  contain  different  properties  to  include  nonsmoothness,  discontinu¬ 
ity,  unknown  derivatives,  and  may  fail  to  return  a  value  for  x  EVL.  Furthermore,  the 
objective  function  also  possesses  a  unique  property:  the  computational  time  required 
to  compute  the  objective  function  decreases  as  objective  function  values  decrease. 

1 . 1  Motivation 

This  class  of  problems  is  motivated  by  an  application  in  which  a  single  function 
evaluation  consists  of  a  numerical  simulation  and  image  registration  process.  The 
two  processes  estimate  parameters  of  experimental  data  through  the  numerical  sim¬ 
ulation  of  fluid  dynamics  and  then  compare  two  images  through  a  metric  measuring 
image  registration.  This  chapter  reviews  these  processes  and  reveals  why  the  central 
processing  unit  (CPU)  time  decreases  as  the  solution  is  approached. 
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1.1.1  Numerical  Simulation  of  Fluid  Dynamics 

A  numerical  simulation  involves  several  steps  in  an  attempt  to  predict  the 
state  of  all  points  in  space  continuously  throughout  time.  From  the  observation  of  a 
process,  a  mathematical  set  of  equations  can  be  developed  to  describe  the  outcome 
of  the  process.  The  equations  can  then  be  solved  approximately  at  a  finite  number 
of  points,  or  discretized  [17],  at  any  given  time.  Furthermore,  from  these  sets  of 
known  equations  one  can  compile  sufficient  information  to  forecast  the  outcome  of 
an  unknown  situation. 

Fluids,  both  liquids  and  gases,  are  defined  as  substances  that  cannot  resist 
shear  stress  when  at  rest  [17].  The  movement  of  fluids  in  a  given  region  C  Mn,  n  G 
{2,  3}  is  governed  by  the  well-known  Navier-Stokes  equations,  which  are  comprised  of 
three  equations,  namely,  momentum  of  the  fluid  (1.2),  conservation  of  energy  (1.3), 
and  continuity  (1.4),  given  as  follows  (see  Griebel  et  al.  [17]): 


■u  +  (w  •  V)m  +  V  p  = 

Aw  +  (1  -  (3T)g, 

(1.2) 

dT 

1  1 

m+a-VT  = 

RePrAT  +  «' 

(1.3) 

div  u  = 

o, 

(1.4) 

where  u  G  M"  is  the  velocity  field,  p  G  K  is  the  pressure  in  the  region,  g  G  Mn 
indicates  body  forces,  Re  G  M  is  the  Reynolds  number  of  the  flow,  Pr  G  M  is  Prandtl 
number  of  the  flow,  /3  G  1  is  the  coefficient  of  thermal  expansion,  q'"  is  the  heat 
source,  and  T  is  the  temperature. 

The  Navier-Stokes  equations  cannot  be  solved  analytically.  To  solve  numeri¬ 
cally,  both  space  and  time  are  discretized,  and  a  finite  differences  scheme  is  applied. 
For  a  two-dimensional  case,  let  x  =  (x[kj  ,  y^j),  u  =  ,  v^)  and  g  =  ( gx,gy )  where 

i  =  1, ...  ,m  and  j  =  1 , ,m  represent  m2  spacial  points  in  the  region  at  iteration 
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k.  The  time  increment  is  denoted  by  St,  and  x  and  y  are  spatially  incremented  by 
Sx  and  Sy  respectively.  The  Laplace  operator  is  denoted  by 


a/  =  E 


ay 

dx\ ' 


Using  central  differences  and  first-order  difference  quotients,  the  discretization  of 
(1.2)-(1.4)  yields  the  equations: 


p(U  _ 


pA)  _ 

Gi,i  ~ 


p(U  _ 

bi,3  ~ 

p<(U  _ 


r  /  A Ui 

+  st  {-rT 

+ st  [nr 

St 


d  (u2 


dx 


1,3 


d  (uv) 


dy 


+  9x 


h3 


d(uv) 

dx 


1,3 


d(v 2 


dy 


ft  ( rp{n+l)  ,  rp(n+ 1)  \ 

bi,3  + -l  (i+1)d )  9x 

pf(U  ( rp(n+ 1)  ,  rp(n+ 1)  \ 

G*j  f-  2  V  1,3  '  ^i,U+1)) 


fc+l) 
M-  • 

—  pU)  _ 

<57 

(5  a: 

(fc+l) 
U;  ■ 

=  cff  - 

<57 

l,3 

<52/ 

(fc+l) 


■Plj  ) 

rtk+1A 


+  9y 


*,3 


(1.5) 

(1.6) 

(1.7) 

(1.8) 

(1.9) 

(1.10) 


dT 

~dt 


(n+l) 


+ 


h3 


d(uT) 

dx 


1  1 
Re  Pr 


d2T 

dx2 


+ 


1,3 


n 

+ 

i,3 


djyT) 

dy 


h3 


dy 2 


+  Qi , 


(fc+l)  D  (fc+l)  ,  (fc+l)  (fc+l)  o  (fc+l)  i  (fc+l) 

pbiw  -  2p..j  +  pgA  +  AT  -  2p>j  +  ho-i) 


(<5a:)2 

■  p(fc) _  p(fc)  pA) _ p»(fc) 

<5t  \  (5a:  <5a/ 


(<%)2 


(1.11) 


(1.12) 
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Equations  (1.11) — (1.12)  lead  to  a  numerical  algorithm  for  simulating  fluid  flow 
under  different  settings.  The  time  step  is  chosen  consistent  with  Griebel  et  al.  [17] 


c-,  .  ,  Re  .  _ 

dt  =  r  mm  I  —  I  n  +  TT 
oxz  oyJ 


-i 


PrRe 


1  1 

8x 2  5y2 


-i 


5x 


5y 


l^maccl 


(1.13) 


where  r  G  [0,1]  to  maintain  stability  and  prevent  oscillations.  The  resulting  nu¬ 
merical  algorithm  can  be  seen  in  Figure  1.1,  followed  by  Figure  1.2  representing 
different  flow  properties  extrapolated  from  the  algorithm  under  certain  parameters 
and  conditions. 


1.1.2  Image  Registration  and  Comparison 

Image  registration  is  the  transformation  of  an  image  into  a  related  image  [24], 
while  image  comparison  is  a  measurement  of  the  transformation  to  determine  the 
level  of  similarity  between  images  [6].  Modersitzki  [24]  shows  different  types  of 
geometric  transformations  possible,  to  include  both  parametric  and  non-parametric 
image  registration.  Consider  the  inner  product  space  of  squared  Lebesque-integrablc 
functions, 

L2( ft)  =  {/  :  -»•  R|  [  \f(x)\2dx  <  cx)}, 

with  the  inner  product  (•  ,  •)  defined  by 

( f,g)L2(tt)=  [  f(x)g(x)dx,  (1.14) 

Jn 

and  consider  the  transformation  of  an  image  T, 

Tu(x)  =  T(x  -  u(x)), 
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•  INITIALIZATION:  Let  t  =  0,  k  =  0,  choose  5t  according  to  (1.13),  and  assign 
initial  values  to  tend,  u,  v,  p,  T. 

•  WHILE  t  <  tend 

Compute  according  to  (1.11). 

Compute  and  according  to  (1.5)  and  (1.6). 

Compute  and  according  to  (1.7)  and  (1.8). 

Solve  simultaneously  according  to  (1.12)  for  all  i,j. 

Compute  u'tfV)  and  1  ^  according  to  (1.9)  and  (1.10). 
t  —  t  +  St 
k  =  k  +  1 


Figure  1.1  Numerical  Simulation  Algorithm 
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Figure  1.2  Numerical  Simulation  Image 
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where  u(x)  is  the  displacement  of  the  point  x.  The  objective  is  to  minimize  the  dis¬ 
tance  between  a  reference  image  R  and  a  template  image  T  through  an  optimal  warp 
transformation  Tu ,  as  defined  by  some  distance  measurement  D,  and  a  smoothing 
or  regularizing  term  S.  This  problem  is  given  by 


min  D  [R,  Tu\  +  aS  [w] , 


where  a  >  0,  and 


D[R,TU }  =  f(x,u(x)) 
jS'frt]  =  A[u\(x), 


(1.15) 


(1.16) 

(1.17) 


for  a  force  measurement  /  and  partial  differential  operator  A.  The  distance  mea¬ 
surement  uses  a  force  /  to  warp  T  into  R ,  creating  the  image  warp  transformation 
Tu.  The  regularizing  term  is  added  to  the  objective  function  to  differentiate  between 
possible  transformations,  since  the  minimum  distance  may  not  be  unique  and  one 
type  of  transformation  may  be  preferred  over  another.  Applying  the  Euler-Lagrange 
equations  to  (1.15)— (1.1T)  yields  the  system  of  nonlinear  differential  equations, 

A[u\(x)  -  f(x,u(x))  =  0, 


from  which  the  iteration  scheme, 

A[uk+1](x)  —  f(x,uk(x))  =  0,  (1.18) 
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is  constructed.  In  particular,  the  following  choices  for  .D,  S  and  A  are  used  consis¬ 
tently  with  Modersitzki  [24], 

D[R,TU]  =  -\\TU  -  R\\l2<Si) 

%]  =  iAuifdx 

z  1=1  ''n 
A[u]  =  A  2u, 

where  ||  ■  ||l2(0)  is  the  norm  induced  by  (1.14).  From  Asaki  [6],  the  iterative  scheme 
constructed  by  (1.18),  the  optimal  warp  found  can  then  be  measured  to  compare  the 
level  of  similarity  between  the  reference  image  and  template  images  dependant  on 
the  warp  transformation. 

Figure  1.3  shows  an  example  of  an  image  registration  of  two  different  simulated 
flows  of  heat  for  a  fluid.  The  top  left  picture  is  the  reference  image  R  for  a  given 
set  of  parameter  values,  the  top  right  is  the  template  image  T  for  a  different  set  of 
parameter  values,  the  bottom  left  is  the  warped  template  image  TUl  and  the  bottom 
right  is  the  difference  between  the  reference  image  and  the  warped  template  image. 

1.1.3  Objective  Function 

The  goal  is  to  find  the  parameters  of  a  predetermined  image  among  a  range 
of  choices  from  a  numerical  simulation.  The  objective  function  (1.1)  accounts  for 
both  the  numerical  simulation  and  the  numerical  image  registration  and  compari¬ 
son  methods  described  above  through  the  following  process.  The  objective  function 
uses  certain  input  parameters  (e.g.,  Reynolds  number)  in  the  numerical  simulation, 
resulting  in  an  image  that  can  be  compared  to  a  predetermined  image.  Based  on 
the  simulation  and  the  predetermined  image,  the  results  of  (1.15)  enable  the  deter¬ 
mination  of  the  difference  between  the  two  images.  Each  possible  image  from  the 
numerical  simulation  results  in  a  distance  value  (??).  The  smallest  distance  value 
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implies  that  the  parameter  set  for  the  numerical  simulation  match  those  of  the  pre¬ 
determined  image.  For  the  numerical  scheme  presented  for  image  registration  (1.18), 
if  the  images  are  very  similar,  only  a  few  iterations  are  necessary  to  transform  the 
template  image  into  the  reference  image  resulting  in  a  small  distance  value.  If  the 
images  are  farther  apart,  more  iterations  are  necessary,  which  would  increase  the 
computational  time  and  distance  value.  From  the  numerical  scheme  and  distance 
value  output,  a  direct  correlation  exists  between  the  objective  function  value  and 
the  amount  of  computational  time. 

1 . 2  Purpose 

The  intent  of  this  research  is  to  develop  an  efficient  strategy  for  numerically 
solving  the  class  of  optimization  problems  in  which  function  values  are  expensive 
to  evaluate,  but  become  less  so  as  a  solution  is  approached.  The  method  utilizes 
a  less  expensive  surrogate  function  and  a  direct  search  method  to  find  the  optimal 
parameter  values.  A  surrogate  function  can  be  thought  of  as  a  replacement  for  / 
that  exhibits  similar  behavior.  Surrogates  are  typically  used  when  a  function  /  is 
computationally  expensive  to  evaluate.  The  nonlinear  optimization  problem  will 
be  solved  by  means  of  a  direct  search  method,  which  does  not  rely  on  derivative 
information. 

1.3  Overview 

Chapter  2  outlines  relevant  literature  on  surrogates  and  direct  search  methods 
while  Chapter  3  develops  an  approach  using  both  methods.  Chapter  4  describes  the 
application  of  the  method  on  two  test  problems  and  their  results.  Chapter  5  finishes 
this  research  with  conclusions  and  recommendations  for  future  research. 
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2.  Relevant  Literature 


This  chapter  reviews  literature  related  to  the  method  developed  to  solve 
the  optimizations  problem  (1.1).  The  first  section  introduces  surrogate  functions, 
their  composition,  and  possible  implementations.  The  next  section  examines  pat¬ 
tern  search  algorithms  as  a  solution  method  for  nonlinear  optimization. 

2.1  Surrogate  Functions 

The  idea  for  surrogates  first  appeared  in  the  work  of  Schrnit  and  Miura  [29], 
who  refer  to  surrogates  as  “approximation  concepts.”  Booker,  et  al.  [13]  characterize 
a  class  of  problems  for  which  surrogate  functions  would  be  an  appropriate  approach, 
suggest  a  surrogate  composition,  and  set  forth  a  general  framework,  called  the  Sur¬ 
rogate  Management  Framework  (SMF),  for  using  surrogates  to  solve  optimization 
problems  numerically.  Most  surrogates  are  one  of  two  types:  simplified  physics  or 
response-based. 

2.1.1  Simplified  Physics:  Low-Fidelity 

A  simplified  physics  model,  also  known  as  a  low-fidelity  model,  makes  certain 
physical  assumptions  that  allow  for  a  reduction  in  the  computational  cost  but  with 
less  accuracy  in  the  solution  and  parameter  values.  The  assumptions  not  only  re¬ 
duce  the  computational  cost  through  estimating  or  eliminating  complex  equations, 
but  may  also  reduce  the  number  of  variables.  These  models  are  typically  problem- 
dependent  since  they  are  based  on  specific  characteristics  of  the  problem. 

Robinson  et  al.  [26],  take  an  approach  for  using  a  low-fidelity  model  as  a 
surrogate  through  two  transformations.  For  x  G  Mn,  x  G  Mn,  h  <  n,  let  g{x)  be  a  low- 
fidelity  model  for  a  computationally  expensive  function  f(x).  The  surrogate  y{x)  is 
the  composition  of  two  transformations  linking  f(x)  to  g{x).  The  first  transformation 
must  correct  the  surrogate  to  associate  the  difference  between  evaluations  of  f(x) 
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and  g{x).  The  correction  transformation  can  use  an  additive  term  (2.1)  to  update 
the  surrogate,  i.e., 


A(x)  =  f(x)  -  g(x)  =>  f(x)=g{x)  +  A(x).  (2.1) 

The  term  A(x)  is  approximated  by  a(x)  using  a  second-order  Taylor  series  around 
a  point  xc, 

a(x)  =  A(xc)  +  VA(xc)  +  ^(z  -  xc)T V2A{xc)(x  -  A(xc))  (2.2) 

=  [/(*«)  -  g(£c)]  +  [V/(xc)  -  Vg(xc)] 

+  ^(x  -  xc)T[V2f(xc)  -  V2g(xc)](x  -  xc). 

Then  from  (2.1), 

y(x)  =  g(x)  +  a{x)  (2.3) 

=  g{x)  +  [f(xc)  -  g(xc)]  +  [V/(xc)  -  V^(fc)] 

+  ^(x  -  xc)T[V2f(xc)  -  V2^(Tc)](a;  -  xc). 

The  second  transformation  is  a  space  mapping  function  P  that  handles  the  change 
in  variable  dimensions  (see  Bandler  et  al.  [11]  for  details),  ft  is  defined  by 


x  =  P(x) 


(2.4) 


such  that 


\\g(x)  -  f(x)  ||  <  £ 


2-2 


where  ||  - 1|  is  a  suitable  norm  and  e  >  0.  Combining  transformations  (2.3)  and  (2.4), 
the  surrogate  function  becomes 

y(x)  =  g(P(x))  +  a(x). 

2.1.2  Response-Based  Model:  Design  and  Analysis  of  Computer  Experiments  (DACE) 

A  response-based  model  makes  no  assumptions  on  the  physics  involved  and  is 
constructed  from  known  responses  of  the  function.  For  a  model  to  be  based  on  the 
responses  of  the  function,  a  basic  structure  of  the  surrogate  must  be  assumed.  Sacks 
et  al.  [27]  discuss  a  way  to  build  a  model  using  both  a  generalized  least-squares 
regression  model  and  interpolation.  In  regression  analysis,  an  output  is  treated  as  a 
realization  of  a  stochastic  process,  since  for  the  same  input  parameters,  a  different 
output  value  can  occur.  A  computer  experiment,  however,  has  the  same  output  value 
for  the  same  input  parameters.  Using  the  deterministic  response  of  the  computer 
experiment,  it  is  treated  as  if  it  is  a  realization  of  a  stochastic  process  from  which  to 
build  a  regression  model.  The  variance  associated  with  the  generalized  least  squares 
regression  model  is  used  to  form  a  correlation  matrix  to  interpolate  the  regression 
model  through  the  deterministic  responses  at  known  data  sites.  The  generalized 
least  squares  model  can  then  act  as  a  predictor  of  the  response  at  an  unknown  point, 
based  on  the  known  points  and  their  known  function  values. 

Following  [27]  and  [22],  the  deterministic  function  y(z)  can  be  modeled  as  a 
realization  of  a  stochastic  process  Y(z),  which  is  composed  of  a  regression  model 
with  coefficients  (3  €  Mn  and  a  random  variable  Z  :  Mn  — >  M,  namely, 

n 

Y(z)  =  V  0sf,(z)  +  Z(z)  =  pTl(z)  +  Z(z).  (2.5) 

3= 1 
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The  random  function  Z(-)  is  assumed  to  have  a  mean  of  zero  and  covariance 

V(w,z)  =  <j2R(9,  w,  z)  (2.6) 

between  Z(w)  and  Z(z),  where  a2  is  the  process  variance  and  R(0,w,z )  is  the  cor¬ 
relation  between  w  and  z. 

Kriging  produces  an  approximate  function  value  at  an  unknown  point  using 
weights  on  known  responses.  Given  a  set  of  known  data  points  {xj},f=1  C  Mn  and 
their  response  {yx}x=i  G  Mm,  a  function  value  at  an  unknown  point  z  G  Mn  can  be 
approximated  from 


y(z)  =  c(x)Tyx  (2.7) 

where  c(x)  G  Mm  is  a  vector  of  weights.  The  best  kriging  weights  are  obtained  by 
minimizing  the  mean  squared  error  (MSE)  between  the  true  model  (2.5)  and  the 
approximate  model  (2.7),  given  by 

MSE\y{z)]  =  E[c(x)Tyx-Y(z)}2 
c{x)Tyx  -  Y(z)  =  c(x)T([3Tf(x)  +  Z(x))  -  {f3Tf(z)  +  Z(z)) 

=  c(x)TZ(x )  -  Z(z)  +  (f(x)Tc(x)  -  f(z))T f3. 

To  ensure  that  the  linear  predictor  is  unbiased,  the  condition 

f(x)Tc(x )  -  f(z)  =  0  (2.8) 

must  hold.  In  this  case,  it  follows  from  (2.8)  that 

c(x)Tyx  -  Y(z)  =  c(x)TZ(x )  -  Z(z), 
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and 


MSE[y(z)]  =  E[c(x)tZ(x)  -  Z(z)}2 

=  E[Z(z)2  +  c(x)TZ(x)Z(x)Tc(x)  -2c(x)tZ(x)Z(z)}.  (2.9) 

From  (2.6), 

E[Z(z)\  =  a 2 

E[Z(x)Z(z)\  =  V (x,  z)  =  a2R(0,  x,  z) 

E[Z(x)Z(x)t]  =  V(x,  x)  =  <j2R{9,  x,  x), 

which  simplifies  (2.9)  to 

MSE[y(z)}  =  a2  +  cr2c(x)TR(9,  x,  x)c(x)  —  2cr2 c(x)T R(9 ,  x,  z).  (2.10) 

The  optimal  weights  c(x)  can  be  found  by  solving  an  optimization  problem  formu¬ 
lated  from  (2.8)  and  (2.10),  namely, 

min  a2(  1  +  c(x)T R(9,  x,  x)c(x)  —  2 c(x)J  R{9 ,  x,  z)) 

C 

s.t.  f(x)Tc(x)  -  f(z)  =  0. 

To  solve  this  problem,  the  Lagrangian  function  L(c,  A),  with  multiplier  A  6  K  is 
formulated,  as 

L(c,  A)  =  a2(  1  +  c(x)tR(9,  x,  x)c(x)  —  2 c(x)TR(9,  x,  z)) 

-XT(f(x)Tc(x)  -  f(z)). 
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Taking  the  gradient  with  respect  to  c(x)  and  A,  the  first-order  necessary  conditions 
for  optimality  become 


R(9,x,x)c(x)  +  A  f(x) 

=  R(9,x,z), 

(2.11) 

A 

f(x)Tc(x) 

=  A/2cr2, 

=  m. 

(2.12) 

Solving  the  system  of  equations  in  (2.11)  and  (2.12)  yields 

A  =  (f(x)TR(9,x,x)~1f(x))~1(f(x)R(9,x,x)^1R(9,x,  z)  -  f(z))  (  2.13) 
c(x)  =  R(9,  x,  x)~1(R(9,  x,  z)  —  f(x) A).  (2-14) 

Substituting  (2.13)  and  (2.14)  into  the  predictor  (2.7), 

y(z)  =  c(x)Tyx 

=  R(6 ,  x,  x)~1(R(9,  x ,  z)  -  f(x)X)Tyx 

=  R(6 ,  x ,  z)T R(91  x ,  x)~1yx  -  ( f{x)TR(9 ,  x,  x)~1R(91  x,  z)  -  f(z))T 
C f(x)TR(9 ,  x,  x)-1  f  (x))^1  f  (x)T R(9 ,  x ,  x)~lyx 
=  f  {z)T (3*  +  R{0,  x,  z)T'y*,  (2.15) 

where 


(3*  =  ( f(x)TR(9,x,x )  1f(x))  1f(x)R(9,x,x)  1yx,  (2.16) 

7*  =  R{9,x,x)~1(yx  -  f{x)P*).  (2.17) 
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The  correlation  matrices  in  (2.15)-(2.17)  are  based  on  the  distance  between  all 
points  and  can  be  expressed  as 


R(0,  x,  x )  = 


R{6 ,  x,  z )  = 


^  R(6,xi,xi)  R(9,x i,x2)  •••  R(6,xi,xn)^ 

R(9,x2,x  i)  R(9,x  2,x2)  ...  R(9,x  2,xn) 


yR(9,xn,x i)  R(9,xn,x2)  ...  R(9,xn,xn)  J 

(  R(9,xuz )  R(9,x2,z)  ...  R(9,xn,z)  j 


for  known  points  Xj  G  Mn,  i  =  1 . . .  k  and  unknown  point,  z  G  Mn.  Then  R(9,  a,  b ) 
can  be  specified  as 

n 

R(9,a,b )  =  (2.18) 

j'=i 

dj  =  dj  —  bj  j  =  1 . . .  n 


for  any  points  a,  b  G  Mn.  Many  choices  for  the  function  Rj(9j,  \dj\)  have  been  pro¬ 
posed  to  allow  a  user  to  determine  the  amount  of  influence  that  known  points  should 
exert  on  an  unknown  point.  Common  choices  include  the  following  [22]: 


Rj(9j,  \dj\) 

= 

exponential 

:  exp(—9j\dj\) 

general  exponential 

■  exp  (-0j|d/"+1) 

0  <  9n+ 1  <  2 

gaussian 

:  exp(—9jd?) 

linear 

:  max{0,  1  — 9j \dj  } 

spherical 

:  1  -  1.5&  +  0.5$ 

0  =  min{l,  0j-|dj|} 

cubic 

:  1-3$ +  2$ 

0  =  minjl,  ?j\dj\} 

spline 

=  @3 

2-7 


^(0) 


1-15^  +  30^,  0  <  ^  <  0.2 
<  1.25(1 -e,)3,  0.2  <  £.,<  1 

k  o,  i  <  0- 


The  optimal  choice  for  9  is  the  maximum  likelihood  estimator  9*  that  solves 


min  | R(9,  x,  a;)|1//m<T2, 
6 


(2.20) 


where 


\R(9,x,x)\ 


det(R) 

—  {Vx  -  f(x)P*)TR(9,x,x)~1(yx  -  f(x)P*). 


2.2  Generalized  Pattern  Search 

Generalized  Pattern  Search  (GPS)  is  a  class  of  direct  search  methods  that 
generates  a  sequence  of  iterates  with  nonincreasing  function  values  to  numerically 
solve  optimization  problems  without  utilizing  derivative  information.  Torczon  [30] 
introduced  pattern  search  for  unconstrained  optimization  problems  as  a  generaliza¬ 
tion  of  several  well-known  methods,  including  the  method  of  Hooke  and  Jeeves  [18] 
and  the  multidirectional  search  algorithm  of  Dennis  and  Torczon  [16],  and  showed  if 
the  objective  function  /  is  continuously  differentiable  and  all  iterates  lie  in  a  com¬ 
pact  set,  then  a  subsequence  of  iterates  converges  to  a  first-order  stationary  point. 
Lewis  and  Torczon  later  extended  GPS  to  problems  with  simple  bounds  [19]  and 
linear  constraints  [20].  Audet  and  Dennis  [7],  developed  a  hierarchy  of  convergence 
results  for  GPS  that  depends  on  the  smoothness  properties  of  /,  while  second-order 
convergence  behavior  was  studied  by  Abramson  [3]. 

Further  extensions  include  work  to  manage  generally  constrained  optimization 
and  mixed  variables.  For  nonlinearly  constrained  optimization,  Audet  and  Dennis 
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introduced  a  filter  GPS  method  [9],  while  Lewis  and  Torczon  use  an  augmented 
Lagrangian  approach  [21],  both  of  which  allow  infeasible  points  in  the  iteration 
sequence.  Audet  and  Dennis  [8]  also  introduced  a  mixed  variable  GPS  algorithm 
for  bounded  constrained  problems  with  continuous  and  categorical  variables,  which 
was  extended  by  Abramson  [2]  using  Liters  for  problems  with  nonlinear  constraints 
and  mixed  variables. 

Audet  and  Dennis  [7]  describe  GPS  as  a  two  step  process  in  generating  a 
sequence  of  nonincreasing  function  values.  At  each  iteration,  GPS  executes  a  search 
and  poll,  which  are  executed  on  a  mesh.  The  optional  search  step  is  very  general 
in  that  it  simply  evaluates  a  finite  number  of  mesh  points,  and  can  be  implemented 
in  a  variety  of  ways.  This  allows  the  user  to  choose  a  specific  heuristic  suited  for  the 
optimization  problem.  Common  choices  include  the  use  of  surrogates  for  expensive 
objective  functions  or  a  random  search  of  the  region  if  nothing  is  known  about  the 
objective  function.  The  search  step  contributes  nothing  to  the  convergence  theory, 
but  is  often  successful  in  aiding  the  algorithm  find  a  quick  improvement. 

An  example  of  a  random  search  method  is  the  Latin  hypercube  design  which 
consists  of  a  random  set  of  “space  filling”  points.  This  is  done  by  dividing  each  of  n 
dimensions  into  m  intervals  of  equal  length,  where  m  is  the  number  of  points  desired. 
This  creates  mn  sectors  for  the  given  space.  Random  points  within  the  m  random 
sectors  are  chosen  such  that  each  column  for  each  dimension  is  chosen  only  once, 
as  described  by  Santner  et  al.  [28].  On  the  region  [0,1]  x  [0,1],  Figure  2.1  shows 
an  example  of  m  =  4  random  points  chosen  on  a  two-dimensional  Latin  hypercube 
design.  From  the  16  sectors,  4  random  points  are  chosen  so  that  no  row  or  column 
is  repeated. 

The  search  step  continues  until  no  further  improvement  in  /  can  be  found,  at 
which,  point  the  poll  step  is  invoked.  Polling  consists  of  an  examination  of  the  points 
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Figure  2.1  Latin  Hypercube  Design  (2-D,  4-Points) 

Pk  neighboring  the  current  solution  xk  on  a  mesh  Mk ,  which  is  defined  by 

Mk  =  {xk  +  A ™Dz  (2.21) 

where  A™  is  the  mesh  size  and  D  is  a  set  of  positive  spanning  directions  (defined  by 
Davis  [15]).  The  set  D  must  be  constructed  so  that  each  direction  dj  G  D  is  required 
to  be  the  product  Gzj  for  some  fixed  nonsingular  generating  matrix  G  €  Mnxn  and 

Zj  g  m1 . 

The  set  of  points  on  the  mesh  neighboring  the  current  solution  xk  is  called  the 
poll  set  and  can  be  expressed  as 

Pk  =  {xk  +  A  ™d:deDk} 

where  Dk  C  D  is  also  a  positive  spanning  set. 

From  the  poll  set,  points  are  evaluated  until  an  improvement  in  the  objective 
function  is  found  or  until  all  points  in  Pk  have  been  evaluated.  If  an  improvement  is 
found,  the  improved  point  becomes  the  new  iterate  and  the  mesh  size  can  be  relaxed 


’  ♦  ' 
(.88, .99) 

(■1,6) 

♦ 

(■7,3) 

♦ 

(.35,.  15) 

♦ 
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according  to  the  rule 


A 


m 
k+ 1 


TWk  A 


m 
k  ? 


(2.22) 


where  0  <  TWk  <  1,  for  r  G  Q,  wr  <  —1  and  Wk  €  [wr ,  —1]  fl  Z.  If  no  improvement 
is  found,  the  current  solution  is  retained  and  the  mesh  is  tightened  according  to 


A 


m 

fc+1 


TWk  A 


m 
k  ? 


(2.23) 


where  0  <  TWk  <  1,  for  r  G  Q,  wt  >  0  and  Wk  €  [0,iut]  ft  Z. 


2.2.1  Mesh  Adaptive  Directed  Search 

Mesh  Adaptive  Directed  Search  (MADS)  was  introduced  by  Audet  and  Dennis 
[10]  as  a  generalization  of  GPS  that  extends  to  nonlinear  constraints  without  the  use 
of  a  penalty  function  or  filter.  The  search  step  in  MADS  is  the  same  as  in  GPS;  the 
difference  lies  within  the  poll  step.  MADS  adopts  the  idea  of  a  frame  from  Coope 
and  Price  [14],  which  is  the  poll  set  in  GPS,  and  generates  an  asymptotically  dense 
set  of  refining  directions.  The  increased  number  of  directions  used  by  MADS  leads 
to  a  stronger  convergence  theory  than  that  of  GPS.  Audet  and  Dennis  [10]  provided 
convergence  to  a  first-order  stationary  point  for  general  nonlinearly  constrained  op¬ 
timization  problems,  even  in  the  nonsmooth  case.  Abramson  [4]  gives  reasonable 
conditions  under  which  convergence  to  a  local  solution  is  ensured.  In  GPS,  the  mesh 
size  parameter  A™  dictates  the  direction  and  magnitude  of  the  mesh  are  equal  for 
each  positive  spanning  set.  MADS  overcomes  this  rule  by  introducing  a  poll  size 
parameter  A^  such  that  A™  <  A^  for  updating  A™  for  all  k, 

lim  A?  =  0  =>  lim  AP  =  0.  (2.24) 

kei<  K  keK  K  v 
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The  updated  poll  set  of  frame  becomes, 


Pk  —  {%k  +  A™d  :  d  G  -Dfc}  (2.25) 

where  D is  a  positive  spanning  set  such  that  each  d  G  D must  have  three  properties 
(see  Audet  and  Dennis  [10]):  d  can  be  written  as  a  nonnegative  integer  combination 
of  the  directions  in  D:  d  =  Du  for  some  vector  u  G  that  may  depend  on  the 

iteration  number  k,  A™||d||  <  Aj(  max{||(f||  :  d!  G  D}  and  limits  of  the  normalized 
sets  Dk  are  a  positive  spanning  set,  Audet  and  Dennis  [10]. 

Figures  2.2  and  2.3  (from  Audet  and  Dennis  [10])  illustrate  the  difference  be¬ 
tween  GPS  and  MADS  frames.  For  GPS,  A)]  =  A™  and  for  MADS,  Apk  =  n^J A™, 
n  is  the  dimension.  In  Figure  2.2,  the  equal  frame  and  mesh  sizes  limit  the  number 
of  directions  GPS  can  investigate.  Figure  2.3,  however,  demonstrates  how  MADS 
allows  polling  in  increasingly  different  directions  and  magnitudes  as  the  mesh  size 
decreases  faster  than  the  poll  size.  This  yields  an  algorithm  similar  to  GPS,  as  seen 
in  Figure  2.4  [10]. 

2.3  Conclusion 

This  chapter  reviewed  the  relevant  literature  and  provided  an  introduction  to 
surrogates,  as  well  as  MADS,  to  solve  an  optimization  problem.  In  the  next  chapter,  a 
method  is  developed  using  these  ideas  to  numerically  solve  the  optimization  problem 
(l.i). 
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Figure  2.3  MADS  Frame 


A  GENERAL  MADS  ALGORITHM 

•  INITIALIZATION:  Let  xq  e  0,  A™  <  Aq,  D,  G,  t,  wri  and  Wt  satisfy  the 
requirements  above.  Set  the  iteration  counter  k  < —  0. 

•  SEARCH  AND  POLL:  Perform  the  search  and  possibly  the  poll  steps  (or  only 
part  of  them)  until  an  improvement  mesh  point  xk+i  is  found  on  the  mesh,  Mk 
(See  (2.21)). 

OPTIONAL  SEARCH:  Evaluate  fn  on  a  finite  subset  of  trial  points  on  the 
mesh,  Mk. 

LOCAL  POLL:  Evaluate  on  the  from  Pk.  (See  (2.25)). 

•  PARAMETER  UPDATE:  Update  A^  according  to  (2.22)  or  (2.23)  and  A|+1 
according  to  (2.24).  Set  k  < —  k  +  1  and  go  back  to  the  SEARCH 

AND  POLL  step. _ 

Figure  2.4  MADS  Algorithm 
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3.  Methodology 

This  chapter  describes  the  approach  for  solving  the  optimization  problem  pre¬ 
sented  in  Chapter  1  with  the  goal  of  solving  the  problem  as  quickly  as  possible.  This 
chapter  outlines  an  arrangement  between  the  use  of  surrogates  within  the  application 
of  a  pattern  search  algorithm. 

3.1  Optimization  Problem  and  Notation 

The  optimization  problem  from  Chapter  1  is  a  computationally  expensive, 
black  box  function, 


min  f(x)  (3.1) 

for  /  :  Mn  — »  M  U  {±oo},  O  —  {x  &  M.n  :  l  <  x  <  u}  and  !,«£  (I  U  {±oo})n  for 
l  <  u.  The  overall  approach  in  solving  (3.1)  numerically  is  the  implementation  of 
GPS  or  MADS  with  a  barrier  approach  [10].  The  application  of  the  barrier  forces 
f(x)  =  oo,  whenever  x  ^  fh  To  treat  (3.1)  when  /  is  computationally  expensive 
and  the  CPU  time  required  to  evaluate  /  at  a  point  x  decreases  as  x  approaches  a 
solution,  a  new  notation  is  first  introduced,  in  which  the  time  to  compute  a  function 
value  is  added  as  part  of  the  input  and  output;  i.e.,  [z,t\  =  f(x,tcut),  where  x  G 
is  a  trial  point,  tcut  is  a  user-specified  CPU  time  threshold,  z  is  the  function  value 
at  x  and  t  is  the  computational  time  needed  to  compute  z.  Once  the  computational 
time  of  a  function  evaluation  exceeds  the  value  specified  by  t,cut ,  then  evaluation  of 
/  is  aborted.  This  is  done  with  the  expectation  that  a  lower  objective  value  will  not 
be  produced  for  the  increased  time  required  for  the  evaluation.  Given  the  current 
minimal  solution  zs  at  iteration  n  with  a  computational  time  of  ts.  let  tcut  =  ts  for 
xn+\.  The  value  for  tcut  can  be  changed  after  each  iteration  to  help  reach  a  solution 
quickly. 


3-1 


3.2  Search:  Optimization  Surrogate 


The  search  step  for  a  pattern  search  method  also  make  use  of  the  time  feature. 
From  the  points  evaluated,  the  function  values  and  times  can  be  used  individually  as 
responses,  to  form  surrogates  F(x)  and  T(x),  respectively.  The  resulting  surrogate 
optimization  problem  is  then  solved  cheaply  at  each  search  step  to  find  a  point  at 
which  to  evaluate  /.  The  application  of  the  two  surrogates  are  tested  in  four  different 
configurations: 


1.  min  F(x),  (3.2) 

2.  minF(x)  (3.3) 

s.t.  T{x)  <  tcut  +  £, 


3.  minTYx),  (3.4) 

4.  minTYx)  (3.5) 

s.t.  F(x)  <  zs, 

where  the  constant  offset  e  is  added  to  the  constraint  in  (3.3)  to  allow  for  variability  in 
computational  time.  The  first  surrogate  is  a  typical  application  of  using  the  function 
values  to  construct  the  surrogates,  while  the  second  adds  a  constraint  surrogate  based 
on  time.  The  idea  is  to  make  sure  the  function  surrogate  and  time  surrogate  both 
see  a  decline  in  value  for  possible  input  values.  The  third  and  fourth  configurations 
switch  the  roles  of  the  two  surrogates  with  the  same  intent  of  finding  a  decline. 

To  solve  the  surrogate  optimization  problems  presented,  the  same  pattern 
search  method  is  applied.  The  method  to  solve  the  surrogate  problem  also  uses 
the  barrier  approach  along  with  a  nonlinear  closed  constraint  in  the  case  of  (3.3)  and 
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(3.5).  The  nonlinear  closed  constraint  is  treated  using  the  same  barrier  approach 
described  in  Section  3.1.  A  function  value  of  oo  is  assigned  for  all  infeasible  points. 
The  expectation  is  that  an  improved  solution  may  be  found  quickly  by  solving  the 
surrogate  problem,  rather  than  spending  the  time  in  the  poll  step. 

The  combination  of  a  pattern  search  method  with  a  barrier  approach  and  the 
use  of  the  parameter  tcut  in  the  original  optimization  problem  causes  a  dilemma  when 
using  surrogate  functions.  When  the  parameter  tcut  stops  the  evaluation  of  /  at  a 
point  x  E  Q,  the  function  value  is  unavailable,  and  a  value  of  oo  is  typically  assigned. 
However,  when  constructing  a  surrogate,  a  value  of  oo  is  not  viable  as  a  response. 
Therefore,  a  different  value  is  imposed  when  tcut  is  reached,  for  [zi,ti]  =  f(xi,tcut), 
if  ti  =  tcut  then  set  zt  =  ma x{z1? . . . ,  ^_i}. 

3.3  Surrogate  Composition 

For  the  surrogate  problems  developed  in  Section  3.2,  DACE  surrogates  from 
Section  2.1.2  were  implemented,  since  the  objective  function  (3.1)  is  treated  as  a  black 
box.  The  next  subsections  discuss  the  initial  points  used  to  generate  the  DACE  sur¬ 
rogates,  the  order  of  the  regression  polynomial,  and  the  correlation  function  chosen 
to  model  the  responses. 

3.3.1  Initial  Points 

A  set  of  initial  points  must  be  evaluated  and  then  used  to  generate  a  surrogate. 
From  a  global  perspective,  a  desirable  property  for  the  initial  points  is  that  they  be 
“space-filling” .  The  idea  is  to  sample  enough  points  in  the  given  space  to  construct 
a  reasonably  accurate  initial  surrogate.  Santner  et  al.  [28]  discuss  two  main  types 
of  space-filling  techniques:  experimental  and  Latin  hypercube  designs. 

Latin  hypercube  designs  were  discussed  in  Section  2.2  as  a  possible  step  in  the 
GPS  algorithm  to  generate  a  set  of  random  points  in  a  region.  The  points  generated 
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could  then  be  used  to  construct  the  initial  surrogate.  However,  when  considering 
the  structure  of  DACE  surrogates,  the  order  of  the  regression  model  dictates  more 
than  just  random  space-filling  points. 

In  an  experimental  design  approach,  Myers  and  Montgomery  [25]  assert  that 
there  are  several  characteristics  to  consider  when  choosing  an  initial  set  of  points  or 
design.  To  minimize  a  DACE  surrogate  and  obtain  a  proper  estimate  of  y(z)  for  the 
unknown  point  z,  a  second-order  polynomial, 

k  k  k  k 

y(z )  =  (3o  +  ^2  @iZi  +  X]  @iiZi  +  ^2  Pijzizj  +  R{0 ,  X ,  z)Try* 

i=  1  i=  1  i=  1  j<i 

is  used  where  (i,tj ,i,j  —  0, 1, . . . ,  k,  7*,  and  R(0 ,  x,  z )  are  defined  in  (2.16)-(2.18).  A 
second-order  polynomial  is  chosen  given  Myers  and  Montgomery  [25]  state  it  as  a 
more  accurate  predictor  of  y(z)  than  a  polynomial  of  a  lower  order  from.  To  properly 
estimate  the  coefficients  fy,  i,j  =  0,1,...  k,  of  the  linear  and  quadratic  terms,  a 
central  composite  design  (CCD)  is  to  determine  for  the  initial  set  of  points.  Figure 
3.1  is  an  illustration  of  a  2-dimensional  CCD  in  which  the  axial  and  center  points 
allow  for  an  estimation  of  quadratic  coefficients,  while  the  (±1,  ±1)  points  enable 
approximation  of  the  linear  terms  and  two-factor  interactions.  Normally,  a  CCD 
includes  three  replications  at  the  center  point,  but  this  is  not  done  here  because  there 
is  no  randomness  in  the  responses,  and  function  evaluations  are  computationally 
expensive. 
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Figure  3.1  Central  Composite  Design  (2-D) 


3.3.2  Correlation  Matrix 

The  choice  of  correlation  function  affects  the  performance  of  the  algorithm.  A 
common  choice  is  the  Gaussian  process  given  by 


R(6,a,b)  =  1]  I! Mr  ) 

3= 1 

Rj(0j, \dj\)  =  exp(-0jdj) 

dj  —  aj  —  bj 


(3.6) 


(3.7) 


for  any  points  a,  b  G  Mn.  Lophaven  et  al.  [22]  have  found  that  in  practice,  the 
Gaussian  process  often  shows  the  same  behavior  of  the  desired  function  as  the  number 
of  known  points  increases.  However,  when  solving  for  RiO^x^x)^1  (see  (2.16)  and 
(2.17)),  Booker  [12]  demonstrates  that  R(9,x,x )_1  can  become  ill-conditioned  as 
points  cluster  together  during  the  convergence  process  of  a  pattern  search  method. 
The  increase  in  the  condition  number,  k(R)  of  R(6 ,  x,  x),  results  in  a  loss  of  significant 
digits  which  can  greatly  impact  the  “optimal”  9  found  (see  2.20)  or  prevent  the 
calculation  of  (3  altogether.  For  specific  values  of  6  and  data  sites,  R(0,  x,  x)  is  a 
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symmetric  matrix  that  can  be  factored  via  Cholesky  decomposition: 


R  =  CCt 


and 


k(R)  =  K(Cf 


AL,(C) 
A  L„(C)  ’ 


where  A  max  and  \min  are  the  largest  and  smallest  eigenvalues  of  C,  respectively, 
obtained  from  the  diagonal  of  C.  The  condition  number  can  be  reviewed  periodically 
for  different  possible  values  of  9.  If  the  condition  number  is  too  large  the  search 
step  is  skipped  and  only  polling  is  executed. 


3.4  Implementation 

The  algorithm  presented  in  Figure  3.2  incorporates  the  ideas  developed  in  this 
chapter  to  solve  (3.1)  efficiently.  MADS-TIME  executes  a  pattern  search  method 
with  surrogates  based  on  function  values  and  computational  times  with  the  addition 
of  a  check  on  the  surrogates  condition  numbers  k(R)  and  an  updating  scheme  for 
the  time  cut-off  parameter  using  the  current  solutions  A,  computational  time.  The 
algorithm  will  be  applied  to  two  test  problems  in  the  next  chapter. 


3-6 


•  INITIALIZE 

Given  an  initial  set  of  points  {xj}”=1,  set  tcut  =  oo,  evaluate  [z^t)  = 
f(xi,  tcut),  i  =  1,  2  . . . ,  n.  Set  k  =  n,  s  =  1,  [£s,  fcut]  =  [min{zj}f=1,  L], 
and  Ms  satisfying  (2.21)  for  given  parameters  Af,  A"1,  D,  r  and  w.  Set  mesh 
size  stop  criteria  Astop  and  surrogate  condition  number  threshold  k. 

•  SEARCH 

For  all  known  responses  {ay,  ti,  2y}f=1,  build  surrogate  optimization  problem, 
o  If  k(R)  >  k  proceed  to  POLL. 

o  Else  k(R)  <  hi,  solve  surrogate  problem  yielding  xk+\,  and  evaluate 

[-^fc+l  j  ^fc+l]  f{Xk+l,  tcut)  • 

•  If  tk+i  =  tcut,  set  Zk+ 1  =  k  —  k  +  1,  proceed  to  POLL. 

•  Else  tk+i  i=-  tcut' 

o  If  Zk+\  <  zs  an  improvement  has  been  found.  Set  k  =  k  +  1,  s  = 
s  +  1,  zs  —  Zk,  tcut  =  tk,  update  A”1  (2.22),  A§  (2.24),  and  return 
to  SEARCH. 

o  Else  Zk+ 1  >  zs  and  no  improvement  has  been  found,  set  k  —  k  +  l, 
proceed  to  POLL. 

•  POLL 

o  If  Zk+i  >  zs  V  Xk+ 1  €  Ps  Set  s  —  s  +  1,  zs  —  zs-i  and  update  A™  (2.23), 
A§  (2.24),  and  proceed  to  CONVERGENCE, 
o  Else  evaluate  a  point  xk+i  £  Ps  (2.25),  [zk+1,tk+i]  =  f(xk+i,  tcut). 

•  If  tk+ 1  =  t^,  sef  zk+ i  =  max{^j}^=i,  k  —  k  +  1,  return  to  POLL. 

•  Else  tk+i  i=- 

tcut’ 

o  If  zk+ 1  <  zs  an  improvement  has  been  found.  Set  k  —  k  +  1,  s  — 
s  +  1,  zs  =  zk,  tcut  =  tk,  update  A™  (2.22),  A§  (2.24),  and  return 
to  SEARCH. 

o  Else  zk+ 1  >  zs  and  no  improvement  has  been  found,  set  k  —  k  +  1, 
return  to  POLL. 

•  CONVERGENCE 

o  If  A?  <  A stop,  stop  convergence  criteria  has  been  met. 
o  Else  Af  >  A st0p,  return  to  SEARCH. 

Figure  3.2  MADS-TIME 
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4.  Implementation 

This  chapter  focuses  on  implementing  the  numerical  algorithm  presented  in 
Chapter  3  on  a  suite  of  test  problems,  and  reporting  numerical  results.  For  each 
scenario,  different  variations  of  the  algorithm  are  applied  and  compared  to  a  base 
case.  The  base  case  implementation  uses  GPS  with  no  search,  a  single  initial  point 
or  CCD  set  of  points,  and  tcut  =  oo.  This  allows  for  a  full  evaluation  of  all  points  and 
a  comparative  analysis  of  the  proposed  algorithm.  The  other  cases  are  the  partial 
and  full  implementation  of  the  algorithm  presented  in  Figure  3.2.  Depending  on  the 
solution  and  effectiveness  of  the  algorithm  presented  in  Figure  3.2,  MADS  may  also 
be  implemented. 

4-1  Processing  and  Coding 

From  Section  3.4,  the  algorithm  presented  was  run  on  a  Linux  operating  system 
using  two  MATLAB®  software  packages,  NOMADm  [4]  for  the  implementation  of 
GPS  and  MADS  algorithms  and  DACE  [23]  to  build  the  surrogates,  along  with 
custom  search  files.  NOMADm  requires  five  files  to  run  the  optimization  problem, 
four  of  which  set  up  the  parameters,  objective  function,  variable  bounds,  and  initial 
points.  The  fifth  file  sets  up  a  custom  search  for  optimizing  the  surrogate  problem. 
The  surrogate  problem  is  solved  by  a  recursive  call  to  the  NOMADm  optimizer  from 
within  the  search  step  using  four  surrogate  files.  Each  hie  can  be  seen  in  Appendix 
A. 

4-2  Test  1:  Lid-Driven  Cavity 

The  first  test  problem  considered  is  known  as  the  lid-driven  cavity  problem. 
For  a  given  two-dimensional  square  domain,  the  Navier-Stokes  equations  describe  a 
fluid  how  with  a  horizontal  velocity  force  on  one  edge.  At  time  zero,  the  fluid  is 
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at  rest.  Once  time  starts,  a  constant  horizontal  velocity  is  asserted  along  the  top 
edge,  causing  a  circular  pattern  of  flow  to  appear  within  the  fluid  over  time.  For 
different  Reynolds  numbers  and  simulation  lengths,  the  velocity  and  viscosity  of  the 
fluid  form  a  different  circular  heat  pattern  throughout  the  region.  At  one  particular 
Reynolds  number  and  simulation  length,  a  reference  image  of  the  heat  pattern  is 
captured  and  then  noise  is  added  into  the  image.  The  goal  is  to  run  a  simulation  for 
different  Reynolds  numbers  and  simulation  lengths,  capture  the  template  image,  and 
compare  the  template  and  reference  images  of  heat  in  an  attempt  to  determine  the 
original  Reynolds  number  and  simulation  length  set  for  the  reference  image.  Figure 
1.3  shows  the  reference  image  and  a  template  image  with  their  comparison. 

4-2.1  Initial  Runs 

The  base  case  of  GPS  with  no  search  was  applied  and  reached  a  solution, 
but  initial  attempts  using  the  MADS-TIME  failed.  After  studying  the  problem 
in  detail,  the  reason  for  failure  became  evident.  Figure  4.1  shows  the  relationship 
between  the  function  values  at  each  trial  point  and  the  CPU  time  required  to  compute 
it.  From  Figure  4.1,  it  is  clear  that  the  main  assumption  does  not  hold.  The 
computational  time  does  decrease  as  the  optimal  solution  is  reached,  but  only  below 
a  certain  function  value.  If  the  function  values  are  too  far  from  that  of  the  optimal 
solution  (i.e.  too  high),  the  computational  time  starts  to  decrease.  Since  the  initial 
point (s)  produced  function  values  above  200,  tcut  does  not  allow  the  computation  of 
a  value  between  25  and  200,  leading  the  GPS  algorithm  to  terminate  as  though  the 
optimal  solution  is  around  200  without  finding  the  true  optimum.  This  situation  was 
remedied  by  changing  the  tcut  parameter.  Given  a  current  minimal  solution  with  its 
computational  time  [zs,ts],  let  tcut  =  2 ts. 

The  preliminary  runs  of  the  surrogate  optimization  problem  provided  useful 
information.  However,  as  GPS  progressed,  the  surrogate  became  ill-conditioned,  due 
to  the  clustering  of  trial  points  as  the  mesh  size  is  reduced.  Even  though  measures 
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Figure  4.1  Test  Problem  1:  Function  and  Time  Correlation 

were  in  place  to  deter  the  possibility  of  ill-conditioning,  it  conld  not  be  prevented, 
and  the  loss  of  significant  digits  was  overwhelming.  To  combat  this,  a  measure  was 
added  to  stop  solving  the  surrogate  problem  once  computational  times  fell  below 
some  threshold  or  if  the  condition  number  was  too  high.  This  makes  sense  for  this 
class  of  problems,  since  the  objective  function  evaluations  are  no  longer  expected  to 
be  expensive. 

4-2.2  Results- GPS 

The  results  for  each  case  are  shown  in  Table  4.1.  The  first  three  are  base  cases 
with  no  search  step  and  tcut  =  oo.  The  first  has  one  initial  point  at  the  center  of 
the  constraint  region,  the  second  chooses  a  random  initial  feasible  point,  and  the 
third  generates  a  set  of  initial  points  from  a  CCD.  The  last  seven  cases  are  different 
variations  of  the  MADS-TIME  algorithm.  The  first  three  are  similar  to  the  base 
case  (no  search  step)  except  for  the  use  of  the  tcut  parameter  to  abort  expensive 
function  evaluations.  The  random  point  is  the  same  random  initial  point  that  was 
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Objective 

Reynolds 

Simulation 

Number  of 

Number  of 

Total  CPU 

Full  Time 

Search 

Initial  Point(s) 

Value 

Number 

Length 

Iterations 

Evaluations 

Time  (min) 

None 

Center 

0.60 

134.13 

4.76 

56.00 

123.00 

126.73 

None 

Random 

0.60 

134.12 

4.76 

58.00 

118.00 

178.31 

None 

CCD 

0.60 

134.13 

4.76 

40.00 

107.00 

196.58 

Cut  Time 

] 

None 

Center 

0.60 

134.13 

4.76 

80.00 

157.00 

257.67 

None 

Random 

0.60 

134.12 

4.76 

92.00 

162.00 

796.40 

None 

CCD 

0.60 

134.13 

4.76 

40.00 

107.00 

109.73 

F(x)  s.t.  T(x) 

CCD 

0.60 

134.19 

4.76 

73.00 

162.00 

135.78 

F(x) 

CCD 

0.60 

134.19 

4.76 

72.00 

165.00 

182.89 

T(x)  s.t.  F(x) 

CCD 

0.60 

134.19 

4.76 

52.00 

133.00 

74.00 

JM _ 

CCD 

0.60 

134.19 

4.76 

49.00 

127.00 

74.48 

Table  4.1  GPS:  Lid-Driven  Cavity  Results 

used  for  the  base  case.  The  final  four  employ  one  of  the  surrogate  optimization 
problems  (3.2),  (3.3),  (3.4),  or  (3.5).  For  each  implementation,  certain  information 
was  collected,  including  the  optimal  parameters  found  with  the  function  value,  the 
number  of  iterations  and  function  evaluations  executed,  and  the  overall  time  it  took 
to  find  the  solution. 

All  runs  found  the  optimal  solution  at  the  same  parameter  values.  The  quick¬ 
est  convergence  occurred  when  the  surrogate  time  function  was  minimized,  subject 
to  a  surrogate  constraint  based  on  function  values  (3.5).  Closely  behind  is  the  surro¬ 
gate  time  function  with  no  constraint  (3.4).  Analysis  of  the  runs  provide  interesting 
insight  and  shows  how  each  component  of  the  proposed  algorithm  affected  the  com¬ 
putational  time. 

Analysis  of  the  first  three  runs  (provided  in  Figure  4.1)  yields  the  following 
observations.  The  initial  center  point  took  more  function  evaluations  than  with  a 
random  initial  point,  but  the  former  took  significantly  less  time.  This  illustrates  the 
importance  of  getting  close  to  the  solution  quickly  so  that  the  time  to  evaluate  the 
objective  function  decreases  quickly  and  why  the  number  of  function  calls  is  not  a 
good  measure  of  the  effectiveness  of  the  different  implementations. 
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Figure  4.2  Center  Comparison 

The  results  with  no  search  demonstrate  the  importance  of  finding  an  appro¬ 
priate  initial  point  to  set  up  the  tcut  parameter.  With  only  one  initial  point,  the 
tcut  parameter  needs  several  iterations  to  build  enough  slack  to  allow  the  sequence 
of  points  to  overcome  the  phenomenon  seen  in  Figure  4.1.  The  extra  iterations  re¬ 
sult  in  a  different  path  to  a  solution,  which  requires  significantly  more  time.  Figure 
4.2  illustrates  the  differences  for  the  two-dimensional  case  with  simulation  length 
and  Reynolds  number  as  variables.  The  blue  line  representing  the  path,  and  blue 
diamonds  for  a  function  evaluation  with  no  tcut,  and  the  red  line  and  squares  rep¬ 
resent  with  tcut .  So  that  there  are  34  more  function  evaluations  and  the  paths  are 
not  the  same  when  the  tcut  parameter  was  implemented.  However,  the  results  using 
initial  CCD  points  and  no  search  shows  an  appropriate  point  for  tcutl  resulting  in  a 
faster  convergence.  Both  cases  that  use  the  CCD  initial  points  take  the  same  path 
to  the  solution,  but  using  the  tcut  parameter  saves  almost  90  minutes  of  time  saved 
compared  to  not  using  it. 

Analysis  of  the  search  surrogates  showed  useful  information  was  provided  in 
finding  the  optimal  solution.  The  use  of  the  constraint  in  both  surrogate  searches 
provides  an  additional  level  of  information  to  the  search,  leading  to  a  different  path 
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Number  of  Search 
Surrogates  Implemented 

Number  of  Lower 
Objective  Function  Value 

Percentage 

Cut  Time 

Search 

Initial  Points 

F(x)  s.t.  T(x)  CCD 

44 

23 

52.27% 

F(x) 

CCD 

34 

15 

44.12% 

T(x)  s.t.  F(x)  CCD 

15 

7 

46.67% 

T(x) 

CCD 

13 

5 

38.46% 

Overall 

106 

50 

47.17% 

Table  4.2  GPS:  Search  Surrogates  Usage 

and  quicker  convergence  time.  However,  the  different  search  surrogates  also  show 
that  quality  is  better  than  quantity.  The  search  surrogate  based  on  function  values 
with  a  time  constraint  (3.3)  provided  a  lower  objective  function  value  more  often 
than  the  surrogate  function  based  on  time  with  a  function  value  constraint  (3.5), 
but  (3.5)  converges  about  60  minutes  faster.  Table  4.2  shows  the  ratio  between  the 
number  of  times  the  surrogate  was  used  and  the  number  of  times  it  resulted  in  a 
point  with  a  lower  function  value. 

Figure  4.3  illustrates  the  quality  in  the  decrease  of  the  objective  function.  Each 
color  in  Figure  4.3  represents  a  different  search  surrogate  and  each  shape  demon¬ 
strates  whether  the  poll  or  search  step  found  the  improved  function  value.  The  figure 
shows  that  the  surrogate  objective  function  based  on  computational  time  provides  a 
lower  function  value  quicker  than  the  surrogate  objective  function  based  on  function 
values,  leading  to  a  faster  convergence  sequence. 

4-3  Test  2:  Barrier  Flow 

The  second  test  problem  applies  the  Navier-Stokes  equations  to  a  vertical  flow 
of  fluid  with  a  barrier  near  the  top  of  the  given  two-dimensional  region.  At  time 
zero,  the  fluid  is  at  rest.  When  time  starts,  an  initial  vertical  force  is  applied  to 
the  top  of  the  fluid  forcing  a  vertical  downward  reaction.  The  vertical  force  on  the 
fluid  passes  a  horizontal  barrier  plane,  causing  the  fluid  to  move  around  the  barrier 
and  to  sway  horizontally  after  passing  the  barrier,  while  maintaining  the  downward 
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♦  Poll  ■  Search 
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T(x)  X  Initial  Point 

Figure  4.3  Decreasing  Function  Value 
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Figure  4.4  Numerical  Simulation  of  Fluid  Barrier 

reaction.  Figure  4.4  illustrates  this  reaction.  For  different  input  parameters,  the 
fluid’s  vertical  and  horizontal  flow  in  the  region  demonstrates  different  states  that 
can  be  compared.  For  this  problem,  in  addition  to  the  two  variables  of  Reynolds 
number  and  simulation  length,  the  Prandtl  number  and  initial  vertical  velocity  are 
added  as  variables.  The  goal  again  is  to  determine  the  parameters  of  a  reference 
image. 

4-3.1  Results-GPS 

The  results  for  each  case  are  shown  in  Table  4.3.  The  runs  were  conducted 
in  a  similar  manner  as  the  previous  problem,  but  the  results  do  not  exhibit  similar 
behavior  as  seen  in  the  previous  test  problem.  The  base  case  using  a  center  initial 
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Objective 

Initial 

Reynolds 

Prandtl 

Simulation 

Number  of 

Number  of 

Total  CPU  1 

Full  TimelSearch 

Initial  Point(s) 

Value 

Velocity 

Number 

Number 

Lenqth 

Iterations 

Evaluations  Time  (min) 

None 

Center 

1074.07 

1.88 

140.00 

0.16 

76.00 

32.00 

160.00 

540.13 

None 

Cut  Time 

CCD 

1310.82 

1.56 

140.00 

0.16 

72.00 

22.00 

155.00 

476.00 

None 

Center 

1074.07 

1.88 

140.00 

0.16 

76.00 

32.00 

160.00 

541.05 

None 

CCD 

1310.82 

1.56 

140.00 

0.16 

72.00 

22.00 

155.00 

475.12 

F(x)  s.t.  T(x)  CCD 

1310.82 

1.56 

140.00 

0.16 

72.00 

22.00 

170.00 

529.86 

F(x) 

CCD 

1310.82 

1.56 

140.00 

0.16 

72.00 

22.00 

170.00 

529.62 

T(x)  s.t.  F(x)  CCD 

1222.01 

1.56 

175.56 

0.16 

87.88 

42.00 

254.00 

778.11 

T(x) 

CCD 

1121.07 

2.00 

154.86 

0.20 

100.00 

26.00 

170.00 

538.85 

Solution 

0.00 

1.00 

160.00 

0.10 

40.00 

_ 

_ 

Table  4.3  GPS:  Barrier  Flow  Results 

point  discovered  the  lowest  function  value  and  took  the  least  amount  of  time.  None 
of  the  evaluations  found  the  optimal  parameters. 

Figure  4.5  illustrates  a  partial  correlation  between  the  function  value  and  time 
for  the  function  evaluations  using  no  search  and  tcut  =  oo.  It  also  illustrates  the  small 
room  for  improvement  in  the  overall  computational  time  with  image  comparison 
times  between  a  fraction  of  a  second  to  six  seconds.  A  comparison  of  GPS  with  no 
search  shows  that  with  a  center  initial  point,  tcut  takes  an  additional  1  minute  to 
converge,  but  with  a  CCD  set  of  initial  points,  tcut  saves  1  minute.  The  variability  in 
computational  time  overcame  any  time  saved  by  the  tcut  parameter.  The  variability 
in  the  computational  time  can  also  be  seen  with  the  search  surrogates  using  an 
objective  surrogate  function  based  on  function  values  with  and  without  a  constraint. 
In  these  cases  the  search  and  poll  steps  evaluated  the  objective  function  at  exactly 
the  same  points  during  the  runs,  but  the  computational  time  was  off  by  only  0.2 
minutes. 

The  effectiveness  of  the  surrogates  in  finding  improved  points  is  shown  in  Table 
4.4.  Only  two  points  provided  an  improved  function  value  and  both  used  a  surrogate 
objective  function  based  on  time.  The  points  led  to  a  lower  objective  function  value 
than  if  they  were  not  used,  demonstrating  the  use  of  surrogate  functions  based  on 
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Figure  4.5  Test  Problem  2:  Function  and  Time  Correlation 


Number  of  Search 
Surrogates  Implemented 

Number  of  Lower 
Objective  Function  Value 

Percentage 

Cut  Time 

Search 

Initial  Points 

F(x)  s.t.  T(x)  CCD 

16 

0 

0.00% 

F(x) 

CCD 

16 

0 

0.00% 

T(x)  s.t.  F(x)  CCD 

15 

1 

6.67% 

T(x) 

CCD 

16 

1 

6.25% 

Overall 

63 

2 

3.17% 

Table  4.4  GPS:  Search  Surrogate  Usage 

time  may  be  more  appropriate.  Surrogate  functions  based  on  objective  function 
values  never  provided  an  improved  point  and  thus  merely  wasted  computational 
time. 


4-3.2  Results-MADS 

Table  4.5  shows  similar  results  using  the  MADS  algorithm,  which  are  no  bet¬ 
ter  than  those  of  GPS.  The  lowest  objective  function  value  was  from  the  MADS 
case  which  used  function  values  for  the  objective  surrogate  with  no  constraint  (3.2). 
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Objective 

Initial 

Reynolds 

Prandtl 

Simulation 

Number  of 

Number  of 

Total  CPU  1 

Full  TimelSearch 

Initial  Point(s) 

Value 

Velocity 

Number 

Number 

Length 

Iterations 

Evaluations  Time  (min) 

None 

Center 

1307.81 

1.63 

200.00 

0.19 

84.00 

16.00 

84.00 

227.32 

None 

Cut  Time 

CCD 

1261.07 

1.69 

204.00 

0.16 

64.00 

12.00 

92.00 

225.77 

None 

Center 

1307.81 

1.63 

200.00 

0.19 

84.00 

16.00 

84.00 

228.08 

None 

CCD 

1261.07 

1.69 

204.00 

0.16 

64.00 

12.00 

96.00 

226.11 

F(x)  s.t.  T(x)  CCD 

1401.52 

1.56 

140.00 

0.16 

72.00 

8.00 

83.00 

246.67 

F(x) 

CCD 

1219.01 

1.56 

174.00 

0.14 

72.19 

12.00 

112.00 

330.50 

T(x)  s.t.  F(x)  CCD 

1399.52 

1.56 

144.16 

0.16 

71.77 

10.00 

86.00 

287.32 

T(x) 

CCD 

1392.17 

1.56 

132.00 

0.16 

71.25 

10.00 

89.00 

273.12 

Solution 

0.00 

1.00 

160.00 

0.10 

40.00 

_ 

_ 

_ 

Table  4.5  MADS:  Barrier  Flow  Results 


Number  of  Search 
Surrogates  Implemented 

Number  of  Lower 
Objective  Function  Value 

Percentage 

Cut  Time 

Search 

Initial  Points 

F(x)  s.t.  T(x)  CCD 

8 

0 

0.00% 

F(x) 

CCD 

12 

0 

0.00% 

T(x)  s.t.  F(x)  CCD 

10 

1 

10.00% 

T(x) 

CCD 

10 

0 

0.00% 

Overall 

40 

1 

2.50% 

Table  4.6  MADS:  Search  Surrogate  Usage 

Unfortunately,  the  random  polling  directions  generated  by  MADS  was  the  determin¬ 
ing  factor  in  finding  an  improved  solution,  instead  of  the  surrogates.  The  random 
polling  directions  for  each  MADS  evaluation  were  set  to  a  specific  random  seed,  but 
the  recursive  call  of  MADS  during  the  surrogate  optimization  process  prevented  the 
use  of  the  same  directions  for  each  polling  step.  Table  4.6  shows  that  the  only  sur¬ 
rogate  to  provide  an  improved  function  value  at  any  time  was  the  constrained  time 
surrogate  (3.5),  and  it  turned  out  to  be  of  little  benefit  to  the  overall  process.  The 
other  surrogates  were  only  different  versions  of  MADS  with  a  CCD  set  of  initial  set 
of  points  and  no  search.  The  quicker  convergence  occurred  because  the  mesh  size 
parameter  is  reduced  at  twice  the  rate  of  GPS,  which  is  necessary  for  the  MADS 
convergence  theory. 
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Function  Value 


Figure  4.6  GPS:  Problem  2  Function  Values 
4-3.3  Results- Solution 

The  results  of  the  barrier  flow  test  problem  using  both  GPS  and  MADS  call 
into  question  the  effectiveness  of  the  MADS-TIME  algorithm.  To  investigate  this 
further,  an  additional  run  was  made  using  the  base  case  scenario,  but  started  at  the 
solution.  This  allows  for  an  evaluation  of  points  neighboring  the  solution  to  analyze 
the  fluctuations  in  the  function  values  and  computational  time. 

Figure  4.6  illustrates  how  impractical  it  is  to  find  the  optimal  solution  to  this 
problem.  As  the  mesh  size  decreases,  the  function  values  dynamically  increase  even 
though  the  neighboring  points  are  approaching  the  solution.  This  trend  leads  the 
algorithm  away  from  the  optimal  solution.  Figure  4.7  illustrates  that  the  image 
registration  times  do  not  follow  the  same  pattern.  As  the  iterations  continue,  the 
image  registration  times  become  less  dynamic. 
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Figure  4.7  GPS:  Problem  2  Time 

4-4  Review 

The  algorithm  presented  in  Chapter  3  was  applied  to  two  test  problems  with 
mixed  results.  The  first  problem  was  successfully  solved  while  the  second  was  not 
solved.  The  second  problem  , however,  turned  out  to  be  an  extremely  difficult  prob¬ 
lem.  Chapter  5  concludes  with  comments  on  solving  nonlinear  optimization  problems 
and  suggestions  for  future  work  on  the  implementation  of  the  algorithm. 
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5.  Conclusions  and  Future  Research 

The  research  in  this  thesis  provides  an  initial  investigation  in  solving  a  opti¬ 
mization  problems  in  which  the  objective  function  values  and  computational  times 
correlated.  Further  study  into  the  solution  and  parameters  established  can  lead  to 
improvement  in  the  application  of  the  algorithm.  This  chapter  offers  some  concluding 
remarks  and  suggests  some  potential  areas  of  future  research. 

5.1  Conclusion 

The  two  problems  studied  in  Chapter  4  show  an  underlying  complexity  in  non¬ 
linear  optimization;  surrogates  and  pattern  search  methods  are  highly  influenced  by 
the  behavior,  or  fluctuations,  in  the  objective  function.  The  test  problems  possessed 
the  function  value  and  time  correlation,  but  only  to  a  point  (see  Figures  4.1  and 
4.5).  The  effectiveness  of  the  algorithm  on  two  test  problems  demonstrates  mixed 
results. 

The  implementation  of  the  tcut  parameter  was  a  useful  way  to  reduce  the  time 
to  solve  the  optimization  problem.  The  first  test  problem  showed  that  this  parameter 
can  hinder  the  performance  if  the  time  assumption  does  not  hold,  but  saves  time  if 
the  time  assumption  does  hold.  The  amount  of  time  saved  appears  to  be  determined 
by  the  problem’s  time  property.  The  majority  of  the  time  required  to  execute  the 
objective  function  for  the  first  problem  is  used  by  the  image  comparison  algorithm, 
but  for  the  second  problem,  the  fluid  simulation  time  contains  the  majority  of  the 
computational  time.  While  tcut  can  be  used  to  stop  the  image  comparison  algorithm, 
it  cannot  stop  the  numerical  simulation,  since  the  image  comparison  requires  the 
image  obtained  from  the  full  simulation. 

Dynamic  volatility  and  fluctuations  in  the  nonlinear  objective  function  can 
greatly  impact  pattern  search  methods,  not  allowing  the  optimal  solution  to  be 
found.  Surrogate  functions  were  implemented  in  the  search  step  to  help  lead  the 


5-1 


iterative  process  quickly  to  a  region  containing  the  optimal  solution.  However,  with 
dynamic  fluctuations,  surrogate  functions  require  more  points  in  strategically  located 
positions  to  effectively  mimic  the  behavior  of  the  objective  function.  The  first  test 
problem  does  not  contain  too  many  fluctuations,  since  the  optimal  solution  was 
found  by  GPS  with  and  without  the  different  search  surrogates.  The  time  function 
seems  less  dynamic  and  generates  a  quicker  path  to  the  region  containing  the  optimal 
solution,  with  or  without  the  function  value  surrogate  constraint. 

The  second  problem  does  not  seem  to  have  the  same  simple  behavior.  For 
GPS  and  MADS,  the  surrogate  objective  function  based  on  function  values  never 
provides  an  improved  function  value.  The  surrogates  based  on  time  did  provide  im¬ 
proved  function  values,  and  a  study  of  the  iteration  sequence  shows  evidence  that  the 
objective  function  value  and  time  surrogate  functions  can  lead  the  iteration  sequence 
in  completely  different  paths.  While  the  surrogate  constraint  based  on  time  rarely 
impedes  the  iteration  sequence  to  solve  the  surrogate  based  on  function  values,  the 
opposite  cannot  be  implied.  That  is  for  both  GPS  and  MADS,  the  surrogate  con¬ 
straint  based  on  objective  function  values  frequently  restricts  the  directions  available 
to  the  surrogate  based  on  time.  The  implementation  of  the  two  surrogates  in  dif¬ 
ferent  variations  questions  which  value,  objective  function  values  or  computational 
time,  can  better  represent  the  true  behavior  of  the  objective  so  that  quicker  progress 
can  be  achieved  toward  the  optimal  solution.  The  main  reason  for  using  one  surro¬ 
gate  over  the  other  should  be  based  on  whether  or  not  the  fluctuations  are  expected 
to  be  less  for  one  output  than  the  other,  while  still  correctly  mimicking  the  behavior 
of  the  true  objective  function.  The  surrogate  based  on  the  less  dynamic  output  has 
a  better  chance  of  correctly  identifying  a  better  point.  A  poor  surrogate  objective 
function  fails  to  generate  good  trial  points,  while  a  poor  surrogate  constraint  may 
hinder  the  adequate  exploration  of  the  domain. 
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5.2  Future  Research 


The  merging  of  different  methods  to  synthesize  a  new  technique  for  solving  this 
class  of  problems  has  provided  new  areas  of  research.  Different  parameter  settings 
and  modifications  to  the  algorithm  could  provide  a  better  solution  in  less  computa¬ 
tional  time.  Simple  changes,  such  as  solving  all  the  proposed  surrogate  optimization 
problems,  (3.2)— (3.5) ,  in  the  search  step  could  lead  to  quicker  convergence.  Each 
choice  for  the  algorithm  was  made  based  on  the  current  literature,  but  clue  to  the 
unique  application,  the  most  effective  parameter  may  have  not  been  used. 

The  parameter  tcut  provided  an  effective  and  efficient  method  to  reduce  the  time 
required  to  find  a  solution.  This  parameter  is  controlled  by  the  user  and  the  approach 
used  in  this  thesis  was  simple  and  based  on  trial  and  error.  An  enhancement  to  the 
algorithm  can  be  made  by  developing  a  more  systematic  was  of  using  this  parameter 
to  overcome  or  detect  deficiencies  illustrated  in  Figure  4.1  to  minimize  the  time 
to  solve  the  optimization  problem.  Since  function  values  and  computational  times 
are  stored  in  order  to  construct  surrogates,  they  can  also  be  used  to  measure  the 
correlation. 

The  DACE  surrogates  implemented  had  many  choices,  including  the  degree  of 
the  polynomial,  initial  points,  and  the  correlation  function.  These  choices  led  to  a 
surrogate  yielding  varied  results.  A  higher  degree  polynomial  may  have  improved 
the  effectiveness  of  the  surrogate,  but  the  degree  of  the  polynomial  dictates  how 
may  initial  points  are  needed  and  the  number  of  evaluations  to  solve  the  surrogate, 
both  of  which  take  time  to  evaluate.  A  balance  between  the  number  of  the  initial 
points  evaluated  and  the  degree  of  the  polynomial  may  be  reached  through  different 
experimental  designs  such  as  a  face  center  cube  or  Plackett-Burman  (see  Myers  and 
Montgomery  [25]). 

The  choice  of  a  Gaussian  correlation  function  R(9j,\dj\)  to  determine  the 
amount  of  influence  between  known  and  unknown  points  was  chosen  based  on  recom- 
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mendations  from  the  literature,  but  one  of  the  other  choices  may  be  more  appropriate 
for  this  class  of  problems.  Particular  choices  of  the  correlation  function,  such  as  the 
exponential  function,  have  a  high  influence  factor  on  an  unknown  point  based  on 
known  points  and  may  enhance  the  DACE  surrogate’s  ability  to  imitate  the  desired 
behavior. 

Not  using  the  surrogates  when  ill-conditioning,  occurs  as  a  result  of  the  cluster¬ 
ing  of  points  during  the  pattern  search  method  was  a  simple  tactic.  However,  a  more 
effective  means  to  combat  this  problem  may  be  the  use  of  a  trust  region  in  which 
the  search  for  an  improved  point  is  limited  to  a  subregion,  D  G  fl  A  framework 
for  using  trust  regions  for  search  approximation  models  can  be  seen  in  the  work  of 
Alexandrov  et  al.  [5] .  Similar  to  this  approach  is  the  idea  of  limiting  the  number  of 
points  used  to  generate  the  DACE  surrogate  to  a  trust  region  based  on  the  frame 
size  K  or  mesh  size  A™.  A  relationship  between  the  number  of  points  within  a 
region  and  the  distance  between  the  points  would  dictate  the  size  the  trust  region. 
These  two  factors  are  considered  because  the  number  of  points  affects  the  desire  in 
mimic  the  true  function  and  the  distance  between  points  creates  the  ill-conditioning 
for  R(9,  x,  a;)^1  (see  Section  3.3.2).  Eliminating  points  extremely  far  apart  through 
the  use  of  the  trust  region  would  ease  the  ill-conditioning. 

Including  a  trust  region  or  different  parameter  settings  could  potentially  im¬ 
prove  performance  for  this  class  of  problems,  but  it  could  also  be  applied  to  problems 
with  the  opposite  behavior;  i.e.  when  the  computational  time  increases  as  the  opti¬ 
mal  solution  is  approached.  Implementations  such  as  this,  or  research  into  different 
parameters,  may  possibly  lead  to  great  strides  in  solving  these  types  of  CPU-time 
correlated  functions. 
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Appendix  A.  Appendix  1 

The  following  Appendix  contains  MATLAB®  code  for  executing  the  MADS-TIME 
algorithm  (3.2).  The  files  work  in  conjunction  with  the  NOMADm  [4]  and  DACE 
[23]  software  packages. 


A-l 


o\°  o\° 


OPTIMIZATION  PROBLEM:  Parameter  File  setup  for  fixed  and  recorded 
variables 


2'9-2'2-2'9-2'9-2'9-2'2-2'2-2'9-2'9-2'2-2'9-2'9-2'9-2'2-2'9-2'9-2'9-2'2-2'2-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'2-2'9-2'2-2'9-2'2-2'2-2'9-2'9- 

oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


%  Variables: 

%  Param 
%  . count 

%  . sear 

%  .nosear 
%  .data 
%  .  x 

%  .  tc 

%  .LB 
%  .UB 
%  .thetaint 
%  .  reg 

%  . corr 

%  . stop 

%  .  kon 

%  .konnum 

-F 

o  .  L 

%  .twarp 


Iterate  Number 

Number  of  Times  the  Search  Surrogate  is  used 

Number  of  Times  the  Search  Surrogate  is  not  used 

Fixed  Parameter  Input 

Variable  Parameter  Input 

Time  Cut  Off  Marker 

Parameter  Lower  Bound 

Parameter  Upper  Bound 

Initial  Theta  estimate 

Degree  of  regression  polynomial  for  surrogates 

Correlation  matrix  for  surrogates 

Time  Threshold 

Condition  Number 

Condition  Number  Threshold 

Output  Function  Value 

Output  Time  Value 


2'2-2'2-2'2-9'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2-2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2- 

oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


function  Param  =  cpuproblem_Param 

%  Initialize  Data 
Param . count =0 ; 

Param . sear=0 ; 

Param . nosear=0 ; 

Param. data=struct2cell (load ( 1  ref image 0 .mat 1 ) ) ; 
Param . x=0 ; 

Param . tc=Inf ; 

Param . LB  =  [ ]  ; 

Param . UB  =  [ ] ; 

Param . f =0 ; 

Param . twarp=0 ; 

Param . cf d=0 ; 

Param . stop=0 ; 

Param . konnum=10000 ; 

Param . kon=0 ; 

Param . thetaint=10*ones ( 1 , length ( [Param . LB] ) ) ; 
Param . reg=@regpoly2 ; 

Param. corr=@corrgauss; 

setappdata ( 0 , ' PARAM ' , Param) ; 
return 
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%  OPTIMIZATION  PROBLEM:  Objective  Function  File 

9'2'2-2'2'2'9-2'9'2'9-2'9-2'9-2'9'2'9-2'9'2'2-2'9'2'2-2'9'2'9-2'9'2'9-9'9'2'2-2'9'9'9-2'9-2'2-2'9'2'9-2'9'2'9-2'9'2'9-9-9'2'2-9'9'2'9- 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


%  Variables: 
%  x 
%  fx 
%  tx 
%  f_max 
%  Param 
%  . count 

%  .data 
%  .  x 

%  .  tc 

9-  -F 

o  . 

%  .twarp 
%  . f_min 

%  .  go 

%  .initial 


Variable  Parameter  Input 
Output  Function  Value 
Output  Time  Value 
Largest  Function  Value 

Iterate  Number 

Fixed  Parameter  Input 

Variable  Parameter  Input 

Time  Cut  Off  Marker 

Output  Function  Value 

Output  Time  Value 

Current  Lowest  Function  Value 

Current  Time  of  Lowest  Function  Value 

Number  of  Initial  Points 


9-9'9-9'9-9'9-9'9-9'9-9'9-9'2-2'9-9'9-9'9-9'9-2'9-9'2-2'2-9'9'2'9-9'9'9'2-9'9'9'2-9'9-9'9-9'9-2'9-9'9-9'9-9'9-9'2-9'9-9'2-9'9'2'2-9'9- 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


function  [fx]  =  cpuproblem (x) 

Param  =  getappdata ( 0 , 'PARAM'); 

Param . count=Param . count  +  1  ; 

%  Setup  Time  Cut  Off  Marker  if  all  initial  points  already  evaluated 
if  Param . count>Param . initial 

[ f_min, call ] =min (Param . f ( 1 : (Param . count -1 ) ) ) ; 

Param . go= (Param . twarp ( call ) ) ; 

Param . tc=2  * Param . go+ . 1  ; 

f_max=max (Param . f ( 1 : Param . initial ) ) ; 

end 

%  User  Supplied  Objective  Function 
[ fx, tx] =cost func (Param . data, Param . tc, x) ; 

%  Negative  Time  Value  is 
if  tx  <  0 

fx=f_max; 

tx . warp=Param . tc; 

end 

%  Record  Necessary  Information  and  Store 
Param . f (Param . count ) =fx; 

Param . x (Param . count ) =x; 

Param . twarp (Param . count ) =tx; 

Param . fmin (Param . count ) =f_min; 

setappdata ( 0 , ' PARAM ' , Param) ; 
return 
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2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2-2-2'2-2'2-2'2-2'2-2'2-2'2-2'2-2'2- 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


%  OPTIMIZATION  PROBLEM:  Initial  Points  File 

2-2'9-9'2-9'2-9'9-9'9-9'2-9'2-2'2-9'9-9'9-9'2-2'2-2'9-9'2'2-9-9'2-2'9-9'2'9'9-9'2-9'9-9'2-2'2-2'2-2'2-2'2'2-2-2'2-2'9-9'2'2'9'2'2-2'9-9'2-2'S' 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


%  Variables: 
%  Param 
%  .LB 
%  .UB 
%  .initial 
%  iterate 


Parameter  Lower 
Parameter  Upper 
Number  of  Initi 
List  of  Initial 


Bound 
Bound 
al  Points 
Points 


9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9'2'9-2'9-2'9-2'9'2'9-2'9-2'9-2'9'2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2-9-2'9-2'9-2'9-2'9-2'9-2'9-2'9- 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


function  iterate  =  cpuproblem_xO 
Param  =  getappdata ( 0 , 1 PARAM ') ; 

%  Central  Composite  Design  for  n  Dimensions 
[n,  m] =size (Param . LB) ; 

W=zeros  ( 1 , n) ; 

X=sqrt (n) *eye (n) ; 

Y=-sqrt (n) *eye  (n) ; 

Z= [ — 1  -1;  1  -1;  -1  1;  11]; 
for  k=l : n-2 

Z= [Z, ones (size(Z,l),l)*-l;Z, ones  (size(Z,l),l)]; 

end 

DOE= [ W; X; Y; Z ] ; 

%  Normalize  Data  to  Upper  and  Lower  Bounds  of  Problem 
[a, b] =size (DOE) ; 
for  j=l : a 

for  k=l : n 

iterate(j) . x (k, 1 ) = ( (Param . UB (k, 1 ) - 
Param . LB (k, 1 ) ) / (2*sqrt (n) ) ) * ( (DOE ( j , k) +sqrt (n) ) ) +Param . LB (k,  1 )  ; 
iterate  (  j )  . p= { } ; 

end 

end 

Param . in it ial=length ( [iterate.x] ) ; 

setappdata ( 0 , ' PARAM ' , Param) ; 
return 


9-2'9-2'9-2'9'2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2-9-2'9-2'9-2'9'2'9-2'9-2'9- 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

%  OPTIMIZATION  PROBLEM:  Bounds  File 

9-2'9-2'2-2'9-2'9-2'9-2'9'2'9-2'9-2'9-2'2-2'2-2'2-2'2-2'9-2'2-2'9'2'9-2'2-2'9-2'9-2'9-2'2-2'9-2'9-2'9-2'9'2'9-2'9-2'2-2'2-2'9-2'2-2'9-2'2-2'9- 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

%  Variables: 


%  n 

%  Param 
%  .LB 
%  .UB 


Problem  Dimension 

Parameter  Lower  Bound 
Parameter  Upper  Bound 


2-2'2-2'2'2'2-2'2-2'2-2'2'2'2'2'2-2'2-2'2-2'2-2'2-2'2-2'2'2-2-2'2-2'2-2'2'2'2-2'2-2'2-2'2'2'2-2'2-2'2-2'2-2'2-2'2-2'2-2-2'2'2-2'2-2'2-9'2-2'2- 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 
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function  [A,l,u]  =  cpuproblem_Omega (n) 
Param  =  getappdata ( 0 , ' PARAM ' ) ; 

A  =  eye (n) ; 

1  =  Param. LB; 
u  =  Param. UB; 
return 


ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


%  SEARCH  SURROGATE:  MADS  Search  Surrogate  File 

2'2'2'9'2'2'9'9'2'9'2'9'2'2'2'9'2'9'2'2'2'2'2'9'2'9'2'9'2'2'9'9'2'9'2'9'2'2'2'9'2'9'2'2'2'2'9'2'2'9'2'2'2'2'2'9'2'2'2'2'2'2'2'9'2'2'2'2'2'2'9' 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


%  Variables: 

%  Problem 
%  Options 
%  Param 
%  . sear 

%  .nosear 
%  .  go 

%  .stop 
%  .  kon 

%  . konnum 

%  .thetaint 
%  pcenter 
%  BestF 


Input  Parameters  Required  for  MADS 
Input  Parameters  Required  for  MADS 

Number  of  Times  the  Search  Surrogate  is  used 

Number  of  Times  the  Search  Surrogate  is  not  used 

Current  Time  of  Lowest  Function  Value 

Time  Threshold 

Condition  Number 

Condition  Number  Threshold 

Current  Surrogate  Theta  Solution 

Current  Solution  Variables 

Solution  to  Search  Surrogate 


2'2'9'9'2'2'9'9'2'2'2'9'2'2'9'9'2'2'2'2'2'2'2'9'2'2'9'9'2'2'9'9'2'2'9'9'2'2'9'9'2'2'9'9'2'2'2'2'2'2'9'9'2'2'9'2'2'2'2'9'2'2'9'9'2'2'2'2'2'2'2' 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


function  S  =  searchsurrogate (pcenter ) ; 

Param  =  getappdata ( 0 , 1 PARAM 1 ) ; 

If  the  Solution  Time  and  Condition  Number  are  within  Thresholds, 
execute  Search  Surrogate 

if  (Param. go>Param. stop)  &&  (Param. kon<Param. konnum) 

Param. sear=Param. sear+1 ; 


Defaults 

Options 

Problem. 

Problem. 

Problem. 

Problem. 

Problem. 

Problem. 

Problem. 

Problem. 

Problem. 

Problem. 


=  mads_def aults (' Surrogate ')  ; 
=  Defaults . Options ; 
nameCache  =  ' SCACHE ' ; 
typeProblem  =  'STRUTH'; 


File.F  =  'mysearch'; 

File.O  =  ' mysearch_Omega ' ; 

File.X  =  ' mysearch_X ' ; 

File. I  =  ' mysearch_xO ' ; 

File.N  =  ' mysearch_N ' ; 

File.P  =  ' mysearch_Param ' ; 

File.C  =  ' mysearch_Cache . mat ' ; 
File.S  =  ' mysearch_Session . mat ' ; 
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Problem . File . H  =  'mysearch_History.txt' 
Problem . File . D  =  'mysearch_Debug.txt'; 
Problem . f Type  =  'M'; 

Problem. nc  =  0; 


%  Specify  Choices  for 
Options .nSearches 
Options . pollStrategy 
Options .pollOrder 
Options .pollCenter 
Options . pollComplete 
Options . NPollComplete 
Options . EPollComplete 


SEARCH 

=  0; 

=  '  Standard_2n ' 
=  'Consecutive' 
=  0; 

=  0; 

=  0; 

=  0; 


%  Specify  Termination  Criteria 


Options . Term .delta 

=  le-4; 

Options . Term .niter 

=  Inf; 

Options . Term . nFunc 

=  50000; 

Options . Term . time 

=  Inf; 

Options . Term . nFails 

=  Inf; 

%  Choices  for  Mesh  Control 

Options . deltaO 

=  0.1; 

Options . deltaMax 

=  Inf; 

Options .meshRefine 

=  0.5; 

Options .meshCoarsen 

=  2.0; 

%  Choices  for  Filter 

management 

Options . hmin 

=  le-8 ; 

Options . hmax 

=  1.0; 

%  MADS  flag  parameter 

values 

Options . loadCache 

=  i; 

Options . countCache 

=  i; 

Options . runStochastic 

=  0; 

Options . scale 

=  2; 

Options .useFilter 

=  i; 

Options . degeneracyScheme  =  'random' 
Options . removeRedundancy  =  1; 

Options . computeGrad  =  0; 

Options . saveHistory  =  0; 

Options . plotHistory  =  0; 

Options . plotFilter  =  0; 

Options . plotColor  =  'k'; 

Options . debug  =  3; 

Options . Sur . Term . delta  =  0.01; 
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%  Create  DACE  Surrogates 
[midthetaint,minkon]  =  mysearch_Param; 

Par am. thetaint=midthetaint ; 

Param. kon=minkon; 

iterateO.x  =  pcenter.x; 
iterateO . p= { } ; 

%  Execute  MADS  to  Solve  Search  Surrogate 

[BestF, BestI, RunStats, RunSet]  =  mads (Problem, iterateO , Options ) ; 

S . x=BestF . x; 

S .p={ } ; 

else 

S . x=pcenter . x; 

S .p={ } ; 

Param. nosear=Param. nosear+1; 

end 

setappdata ( 0 , ' PARAM ' , Param) ; 
return 


ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


%  SEARCH  SURROGATE:  Create  DACE  Surrogates 

9'9'2'2'9'9'2'9'9'9'9'9'9'9'2'9'9'9'2'9'9'9'2'9'9'9'9'9'9'9'2'9'9'9'2'9'9'9'2'2'9'9'2'2'9'9'2'2'9'9'2'9'9'9'2'9'9'9'2'9'9'9'2'9'9'9'9'9'9'9'2' 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


%  Variables: 

%  kon 
%  Param 

o  .X 

9-  -F 

o  .  -L 

%  .twarp 
%  .LB 
%  .UB 
%  . thetaint 

%  .  reg 

%  .corr 
%  .  kon 

%  .theta 
%  .lob 
%  .  upb 

%  .dmodelF 
%  .dmodelT 
%  .C 


Condition  Number 

Variable  Parameter  Input 
Output  Function  Value 
Output  Time  Value 
Parameter  Lower  Bound 
Parameter  Upper  Bound 
Initial  Theta  estimate 

Degree  of  regression  polynomial  for  surrogates 

Correlation  matrix  for  surrogates 

Condition  Number 

Initial  Theta  estimate 

Theta  Lower  Bound 

Theta  Upper  Bound 

DACE  Function  Value  Surrogate 

DACE  Time  Value  Surrogate 

Cholesky  Decomposition 


ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


function  [midthetaint , minkon]  =  mysearch_Param 
Param  =  getappdata ( 0 , 1 PARAM ') ; 
kon=0 ; 
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%  Determine  Lower  Bound  on  Theta 
while  kon  <  10e3 

[ dmodel ] =dacef it (Param . x,  Param.f,  Param.reg,  Param.corr, 

Param . thetaint ) ; 

kon=  condest (dmodel . C) ; 

Param . thetaint=Param . thetaint/ 2 ; 

end 

thetaO=Param . thetaint ; 

Param . lob=2  * thetaO ; 

%  Determine  Lower  Bound  on  Theta 
dLnorm  =  inf; 
h  =  0.1; 

while  dLnorm  >  le-2 

thetal=thetaO+h*ones ( 1 , length (thetaO ) ) ; 

dmodelO=dacef it (Param. x, Param. f , Param. reg,  Param. corr,  thetaO) ; 
dmodell=dacef it (Param. x, Param. f , Param. reg,  Param. corr,  thetal) ; 
C0=dmodel0 . C; 

R0=C0*C0 ' ; 

Rl=dmodell . C*dmodell . C ' ; 
dR  =  (R1  -  R0)  .  /h; 

G=dmodel0 . gamma ' ; 

S=dmodel0 . sigma2  ; 

dL= (G ' *dR*G) / (2*S) -trace (CO ' \ (C0\dR) ) ; 
dLnorm  =  norm(dL); 
theta 0=2  * thetaO ; 

end 

Param . upb=thetaO/ 2  ; 

Param. theta  =  (Param . lob+Param . upb) ./2; 

%  Compute  DACE  models  for  Function  Value  and  Time 

[Param . dmodelF ] =dacef it (Param . x, Param . f , Param .reg, Param .corr, Param . thet 
a, Param . lob, Param . upb) ; 

[Param . dmodelT ] =dacef it (Param . x, Param . twarp, Param .reg, Param .corr, Param . 
theta, Param . lob, Param . upb) ; 
midthetaint=Param . dmodelF . theta; 

%  Compute  the  Condition  Numbers  of  the  Surrogates 
konl=  condest (Param. dmodelF . C) ; 
kon2=  condest (Param . dmodelT . C) ; 
minkon=min (konl , kon2 ) ; 

setappdata ( 0 , ' PARAM ', Param) ; 
return 


ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

%  SEARCH  SURROGATE:  Surrogate  Objective  Function  File 

9'9-9'9-9'9'9'9-9'9'9'9-9'9-9'9'9'9'9'9-9'9-9'2'9'9'9'2-9'9-9'2-9'9'9'9-9'9-9'9'9'9'9'9-9'9-9'9'9'9'9'9-9'9-9'9-9'9'9'9-9'9-9'9-9'9-9'9-9'9-9- 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 
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%  Variables : 


%  F 

%  Param 
%  .dmodelF 


Surrogate  Variable  Parameter  Input 
Surrogate  Output  Function  Value 

DACE  Function  Value  Surrogate 


9'2'9'9'9'9'9'9'9'2'9'9'9'9'9'2'9'2'9'2'9'9'9'2'9'9-9'9-9'9'9'9'9'9'9'9'9'9'9'2'9'9'9'2'9'2'9'2'9'9'9'2'9'9'9'2'2'2'9'2'9'9'9'2'9'2'9'2'2'9'9- 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


function  F=mysearch (x) 

Param  =  getappdata ( 0 ,  'PARAM'); 

%  Evaluate  DACE  Surrogate  for  Variable 
[F ] =pr edict or (x, Param . dmodelF )  ; 

setappdata ( 0 , ' PARAM ' , Param) ; 
return 


2'2'9'9-9'9-9'9-9'2'2'2'2'9-9'9-9'9-9'2'2'9'9'2'2-9'9'2'2'2'9'2'2'9'9'2'9'2'9'9'9-9'9'2'2'9'9'2'2'2'9'9'2'2'9'2'9'2'9'2'2'9'9'2'2'9'2'2'2'9'9- 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

%  SEARCH  SURROGATE:  Surrogate  Constraint  File 

2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'9'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2'2' 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


%  Variables : 
%  x 
%  T 

%  Param 
%  .dmodelT 
%  .  go 


Surrogate  Variable  Parameter  Input 
Surrogate  Constraint  Output  Function  Value 


DACE  Time  Value  Surrogate 

Current  Time  of  Lowest  Function  Value 


2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'2'2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9'2'9-2'9-2'2-2'9-2'9-2'9-2'9-2'9-2'9-2'2-2'2-2'9-2'9-2'2-2' 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


function  isfeasible=mysearch_X (x) 
Param  =  getappdata ( 0 , 'PARAM'); 


%  Evaluate  DACE  Surrogate  for  Variable 
T=predictor (x, Param. dmodelT)  ; 


%  Determine  if  point  meets  constraint 
if  T  <=  Param . go*2+ . 1 
isfeasible=l ; 

else 

isfeasible=0; 

end 


setappdata ( 0 , ' PARAM ' , Param) ; 
return 


2'9'9'9'2'9'9'9'9'9'9'9'9'9'9'9'2'9'9'9'2'9'9-9'9'9'9'9'9'9'9-9'9'9'9'9'9'9'9-9'9'9'9'9'9'9'9'9'9'9'9'9'2'9'9'9'9'9'9'9'2'9'9-9'2'9'9'9'9'9'9' 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

%  SEARCH  SURROGATE:  Surrogate  Bounds  File 

2'9'2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'2'2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'2-2'9'2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9-2'9'2'9-2'2-2' 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 
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%  Variables : 

%  n  :  Problem  Dimension 

%  Param 

%  . LB  :  Parameter  Lower  Bound 

%  . UB  :  Parameter  Upper  Bound 

9'9-9'9-9'9-9'9'9'9'9'9-9'9-2'9-9'9'9'2-9'9-2'2-9'9-9'9-9'9-9'2-9'9-9'9-9'9-2'2-9'9'9- 

ooooooooooooooooooooooooooooooooooooooooooo 

function  [A, l,u]  =  mysearch_Omega (n) 

Param  =  getappdata ( 0 , 'PARAM'); 

A  =  eye (n) ; 

1  =  Param. LB; 
u  =  Param. UB; 

return 
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